Stability of Bagged Decision Trees

Author

  • Yves Grandvalet
Abstract

Bagging is an aggregation technique in which an estimator is obtained as the average of predictors computed on bootstrap samples. Bagged decision trees almost always improve on the original predictor, and it is commonly believed that the effectiveness of bagging is due to variance reduction. In this work we show a counter-example and give experimental evidence that bagging stabilizes prediction by equalizing the influence of the training examples. Highly influential examples are down-weighted because of their absence from some of the bootstrap samples. We then empirically tested some recent theories relating stability to generalization error. The stability hypothesis was evaluated on several benchmarks, and we show that the process of equalizing the influence of individual examples significantly improves stability, which in turn can improve generalization ability.
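As a reading aid, the following is a minimal sketch of the bagging procedure the abstract refers to: decision trees are fit on bootstrap samples, their predictions are averaged, and each training example is absent from a sizeable share of the samples. It assumes scikit-learn and a synthetic dataset; the constants (300 examples, 50 trees) are illustrative and not taken from the paper.

```python
# Minimal bagging sketch (assumed setup, not the authors' experimental protocol).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

n_estimators = 50
trees, in_bag = [], []
for _ in range(n_estimators):
    # Bootstrap sample: n points drawn with replacement, so roughly 37% of the
    # training examples are left out of each individual sample.
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    in_bag.append(np.unique(idx))

# The bagged prediction averages the trees' votes; an example that is absent
# from many bootstrap samples influences fewer of the trees.
votes = np.mean([t.predict(X) for t in trees], axis=0)
y_bagged = (votes > 0.5).astype(int)

out_of_bag_frac = 1 - np.mean([len(u) for u in in_bag]) / len(X)
print("training accuracy of the bagged ensemble:", np.mean(y_bagged == y))
print("average fraction of examples missing from a bootstrap sample:", round(out_of_bag_frac, 2))
```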

Similar articles

Using HMMs and bagged decision trees to leverage rich features of user and skill from an intelligent tutoring system dataset

This article describes the user modeling, feature extraction, and bagged decision tree methods that were used to win the 2nd-place student prize and 4th place overall in the ACM's 2010 KDD Cup.

Improved Class Probability Estimates from Decision Tree Models

Decision tree models typically give good classification decisions but poor probability estimates. In many applications, it is important to have good probability estimates as well. This paper introduces a new algorithm, Bagged Lazy Option Trees (B-LOTs), for constructing decision trees and compares it to an alternative, Bagged Probability Estimation Trees (B-PETs). The quality of the class proba...

An Empirical Evaluation of Supervised Learning for ROC Area

We present an empirical comparison of the AUC performance of seven supervised learning methods: SVMs, neural nets, decision trees, k-nearest neighbor, bagged trees, boosted trees, and boosted stumps. Overall, boosted trees have the best average AUC performance, followed by bagged trees, neural nets and SVMs. We then present an ensemble selection method that yields even better AUC. Ensembles are...

Accurate estimation of retinal vessel width using bagged decision trees and an extended multiresolution Hermite model

We present an algorithm estimating the width of retinal vessels in fundus camera images. The algorithm uses a novel parametric surface model of the cross-sectional intensities of vessels, and ensembles of bagged decision trees to estimate the local width from the parameters of the best-fit surface. We report comparative tests with REVIEW, currently the public database of reference for retinal w...

Bagging Soft Decision Trees

The decision tree is one of the earliest predictive models in machine learning. In the soft decision tree, based on the hierarchical mixture of experts model, internal binary nodes take soft decisions and choose both children with probabilities given by a sigmoid gating function. Hence for an input, all the paths to all the leaves are traversed and all those leaves contribute to the final decis...
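The summary above outlines how a soft decision tree routes an input down every path, with each internal node weighting both children by a sigmoid gate so that all leaves contribute to the prediction. A minimal illustration of that recursion, with made-up weights, depth, and leaf values (nothing here is taken from the cited paper), might look like:

```python
# Toy soft decision tree: each internal node blends both children with a
# sigmoid gating probability instead of making a hard left/right choice.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, node):
    """Recursively blend both children's outputs with gating probabilities."""
    if "leaf" in node:                              # a leaf contributes its value
        return node["leaf"]
    p_right = sigmoid(node["w"] @ x + node["b"])    # gating probability for the right child
    return (p_right * soft_tree_predict(x, node["right"])
            + (1 - p_right) * soft_tree_predict(x, node["left"]))

# A two-level soft tree over 2-dimensional inputs with illustrative parameters.
tree = {
    "w": np.array([1.0, -0.5]), "b": 0.0,
    "left":  {"w": np.array([0.3, 0.3]), "b": -0.2,
              "left": {"leaf": 0.1}, "right": {"leaf": 0.4}},
    "right": {"w": np.array([-0.8, 1.2]), "b": 0.1,
              "left": {"leaf": 0.6}, "right": {"leaf": 0.9}},
}
print(soft_tree_predict(np.array([0.5, 1.0]), tree))
```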

Bagging tree classifiers for laser scanning images: a data- and simulation-based strategy

Diagnosis based on medical image data is common in medical decision making and clinical routine. We discuss a strategy to derive a classifier with good performance on clinical image data and to justify the properties of the classifier by an adapted simulation model of image data. We focus on the problem of classifying eyes as normal or glaucomatous based on 62 routine explanatory variables deri...

Journal:

Volume   Issue

Pages   -

Publication date: 2006